Abstract: Let $\mathbf{x}$ be a signal to be sparsely decomposed over a redundant dictionary $\mathbf{A}$, i.e. a sparse coefficient vector $\mathbf{s}$ has to be found such that $\mathbf{x} = \mathbf{A}\mathbf{s}$. It is known that this problem is inherently unstable against noise, and to overcome this instability, the authors of [1] have proposed to use an "approximate" decomposition, that is, a decomposition satisfying $\|\mathbf{x} - \mathbf{A}\mathbf{s}\|_2 \le \delta$ rather than the exact equality $\mathbf{x} = \mathbf{A}\mathbf{s}$. They have then shown that if there is a decomposition with $\|\mathbf{s}\|_0 < (1 + M^{-1})/2$, where $M$ denotes the coherence of the dictionary, this decomposition is stable against noise. On the other hand, it is known that a sparse decomposition with $\|\mathbf{s}\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ is unique. In other words, although a decomposition with $\|\mathbf{s}\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ is unique, its stability against noise has been proved only for the much more restrictive decompositions satisfying $\|\mathbf{s}\|_0 < (1 + M^{-1})/2$, because usually $(1 + M^{-1})/2 \ll \frac{1}{2}\mathrm{spark}(\mathbf{A})$.

This limitation may not have been very important before, because $\|\mathbf{s}\|_0 < (1 + M^{-1})/2$ is also the bound which guarantees that the sparse decomposition can be found by minimizing the $\ell^1$ norm, a classic approach for sparse decomposition. However, with the availability of new algorithms for sparse decomposition, namely SL0 and Robust-SL0, it becomes important to know whether or not unique sparse decompositions with $(1 + M^{-1})/2 \le \|\mathbf{s}\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ are stable. In this paper, we show that such decompositions are indeed stable. In other words, we extend the stability bound from $\|\mathbf{s}\|_0 < (1 + M^{-1})/2$ to the whole uniqueness range $\|\mathbf{s}\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$. In summary, we show that all unique sparse decompositions are stably recoverable. Moreover, we see that sparser decompositions are 'more stable'.

Index Terms: Sparse Signal Decomposition, Sparse Recovery, Compressed Sensing, Sparse Component Analysis (SCA), Overcomplete Dictionaries.



I. Introduction

Let $\mathbf{A}$ be an $n \times m$ matrix with $m > n$, and consider the Underdetermined System of Linear Equations (USLE) $\mathbf{A}\mathbf{s} = \mathbf{x}$. Such a linear system typically has infinitely many solutions, but let us consider its sparsest solution, that is, a solution $\mathbf{s}_0$ which has as many zero components as possible.

This problem has recently attracted a lot of attention from many different viewpoints. It is used, for example, in Compressed Sensing (CS) [2], [3], [4], underdetermined Sparse Component Analysis (SCA) and source separation [5], [6], [7], [8], atomic decomposition on overcomplete dictionaries [9], [1], decoding real field codes [10], image deconvolution [11], [12], image denoising [13], electromagnetic imaging and Direction of Arrival (DOA) finding [14], etc.

From the atomic decomposition viewpoint [15], the columns of $\mathbf{A}$ are called 'atoms' and the matrix $\mathbf{A}$ is called the 'dictionary' over which the 'signal' $\mathbf{x}$ is to be decomposed. When the dictionary is overcomplete ($m > n$), the representation is not unique, but by the sparsest solution we mean the representation which uses as few atoms as possible to represent the signal.

Sparse solutions of underdetermined linear systems would not be useful unless positive answers can be provided for the following three questions:

1) Uniqueness: Is such a solution unique?

2) Practical algorithm: Is it practically possible to find the sparsest solution of a USLE?

3) Stability against noise: Doesn't a small amount of noise result in a completely different sparse solution?

In this paper we study the third question, and we generalize previously available results. To better explain the problem and our contribution, we first briefly review in Section II the available results about the above questions, and then explain in Subsection II-D what our contribution is. We then state the main theorem in Section III. Finally, a generalized result will be stated in Section IV.
II. Problem statement

A. Uniqueness?

The uniqueness problem has been addressed in [14], [16], [17], and it has been shown that if an underdetermined linear system has a sparse enough solution, it is its unique sparsest solution. More precisely:

Theorem 1 (Uniqueness [16], [17]): Let $\mathrm{spark}(\mathbf{A})$ denote the minimum number of columns of $\mathbf{A}$ that are linearly dependent, and let $\|\cdot\|_0$ denote the $\ell^0$ norm of a vector (i.e. the number of its non-zero components). Then if the USLE $\mathbf{A}\mathbf{s} = \mathbf{x}$ has a solution $\mathbf{s}_0$ for which $\|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$, it is its unique sparsest solution.

A special case of this uniqueness theorem has been stated in [14]: if $\mathbf{A}$ has the Unique Representation Property (URP), that is, if all $n \times n$ submatrices of $\mathbf{A}$ are non-singular, then $\mathrm{spark}(\mathbf{A}) = n + 1$ and hence $\|\mathbf{s}_0\|_0 \le \frac{n}{2}$ implies that $\mathbf{s}_0$ is the unique sparsest solution.
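As a rough illustration of Theorem 1, the spark of a very small dictionary can be computed by exhaustive search. The sketch below is only meant for toy sizes (the search is exponential in $m$); the function name `spark`, the tolerance, and the 2 x 4 example matrix are assumptions made for illustration and do not come from the paper.

```python
import itertools
import numpy as np

def spark(A, tol=1e-10):
    """Smallest number of linearly dependent columns of A (brute force)."""
    n, m = A.shape
    for k in range(1, m + 1):
        for cols in itertools.combinations(range(m), k):
            # k columns are dependent iff their submatrix has rank below k
            if np.linalg.matrix_rank(A[:, list(cols)], tol) < k:
                return k
    return m + 1  # convention used here when no dependent subset exists

# Hypothetical 2 x 4 dictionary in which every pair of columns is independent
A = np.array([[1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, -1.0]])
spark_A = spark(A)                  # n + 1 = 3 for this matrix (URP holds)
max_unique_k = (spark_A - 1) // 2   # largest integer k with k < spark(A) / 2
print(spark_A, max_unique_k)
```

For this toy matrix the URP holds, so any solution with at most one non-zero entry is the unique sparsest one.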
B. Practical Algorithm?

Finding the sparsest solution of a USLE can be expressed as:

$$(P_0): \quad \text{Minimize } \|\mathbf{s}\|_0 \ \text{ subject to } \ \mathbf{A}\mathbf{s} = \mathbf{x}, \tag{1}$$

where $\|\cdot\|_0$ stands for the $\ell^0$ norm of a vector. Solving the above problem requires a combinatorial search and is generally NP-hard. Hence, many algorithms have been proposed to solve the problem indirectly. One of the first and most successful ideas is Basis Pursuit (BP) [9], which replaces the above problem by

$$(P_1): \quad \text{Minimize } \|\mathbf{s}\|_1 \ \text{ subject to } \ \mathbf{A}\mathbf{s} = \mathbf{x}, \tag{2}$$

where $\|\mathbf{s}\|_1 = \sum_i |s_i|$ is the $\ell^1$ norm of $\mathbf{s}$. Note that the problem $P_1$ is convex and can be easily solved by using Linear Programming (LP) techniques. Moreover, it has been shown that if the sparsest solution $\mathbf{s}_0$ is highly sparse, then the solution of $P_1$ is also the sparsest solution, i.e. it is also the solution of $P_0$.
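To make the cost of the combinatorial search behind $(P_0)$ concrete, here is a minimal sketch that tries supports of increasing size and stops at the first exact fit. It is not one of the algorithms discussed in the paper; the name `solve_p0`, the tolerance, and the toy data are assumptions for illustration, and the number of supports visited grows like $\binom{m}{k}$.

```python
import itertools
import numpy as np

def solve_p0(A, x, tol=1e-9):
    """Sparsest s with A s = x (up to tol), by exhaustive search over supports."""
    n, m = A.shape
    if np.linalg.norm(x) <= tol:
        return np.zeros(m)
    for k in range(1, m + 1):
        for cols in itertools.combinations(range(m), k):
            cols = list(cols)
            # least-squares fit restricted to the candidate support
            coef, *_ = np.linalg.lstsq(A[:, cols], x, rcond=None)
            if np.linalg.norm(A[:, cols] @ coef - x) <= tol:
                s = np.zeros(m)
                s[cols] = coef
                return s
    return None

# Hypothetical 2 x 4 dictionary; x uses only the third atom
A = np.array([[1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, -1.0]])
x = 2.0 * A[:, 2]
s_hat = solve_p0(A, x)
print(s_hat)
```

Even for moderate sizes such as $m = 1000$ and $k = 10$ this enumeration is out of reach, which is the practical motivation for replacing $(P_0)$ by $(P_1)$ or by a smoothed surrogate.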

To express this property more precisely, let the columns of $\mathbf{A}$ be normalized to have unit (Euclidean) norm. Let us also define the 'coherence', $M$, of the dictionary $\mathbf{A}$ as the maximum correlation between its atoms, that is:

$$M \triangleq \max_{i \neq j} |\mathbf{a}_i^T \mathbf{a}_j|, \tag{3}$$

where $\mathbf{a}_i$, $i = 1, \dots, m$, denote the columns of $\mathbf{A}$. Then:

Theorem 2 (Equivalence of $P_0$ and $P_1$ [16], [17]): If the USLE $\mathbf{A}\mathbf{s} = \mathbf{x}$ has a solution $\mathbf{s}_0$ for which $\|\mathbf{s}_0\|_0 < \frac{1 + M^{-1}}{2}$, then it is the unique solution of both problems $P_0$ and $P_1$.

In other words, if the sparsest solution satisfies $\|\mathbf{s}_0\|_0 < \frac{1 + M^{-1}}{2}$, it can be found by solving the convex program $P_1$.



Remark 1. Note that the bound on sparsity that guarantees the equivalence of $P_0$ and $P_1$ is much more restrictive than the bound that guarantees the uniqueness of the sparsest solution. For example, suppose that the dictionary $\mathbf{A}$ is constructed by concatenating two orthonormal bases, $\mathbf{A} = [\boldsymbol{\Phi}, \boldsymbol{\Psi}]$, and hence $m = 2n$. It can be easily shown [16] that in this case the smallest possible value for $M$ is $1/\sqrt{n}$ (this value is obtained, for example, for the concatenation of a Dirac and a Fourier dictionary), which is the most favorable case for the bound of Theorem 2. Consider for example such a dictionary $\mathbf{A}$ with $m = 1000$ and $n = 500$, which satisfies the URP and has this smallest possible coherence $M = 1/\sqrt{500} \approx 1/22.36$. Then, by Theorem 1, a solution $\mathbf{s}_0$ with $\|\mathbf{s}_0\|_0 \le 250$ is necessarily the unique sparsest solution. However, Theorem 2 guarantees that the sparsest solution can be found by $P_1$ only where $\|\mathbf{s}_0\|_0 < (1 + 22.36)/2 = 11.68$, that is, $\|\mathbf{s}_0\|_0 \le 11$. In other words, if there is a solution $\mathbf{s}_0$ such that among its 1000 entries there are at most 250 non-zero entries, it is the unique sparsest solution, but we cannot necessarily find it by solving $P_1$ unless among these 1000 entries there are at most 11 non-zero entries. Consequently, the equivalence of $P_1$ and $P_0$ is guaranteed only when there exists a 'very very' sparse solution.
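The arithmetic of this remark can be reproduced with a few lines of code. The sketch below computes the coherence (3) of a small two-basis dictionary and then evaluates both sparsity bounds for $n = 500$. As an assumption made only to keep the example real-valued, a Sylvester-Hadamard basis is used in place of the Fourier basis (it reaches the same coherence $1/\sqrt{n}$ when $n$ is a power of two); the function names are also illustrative.

```python
import math
import numpy as np

def coherence(A):
    """Largest absolute inner product between distinct unit-norm atoms."""
    A = A / np.linalg.norm(A, axis=0)
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)        # ignore each atom's product with itself
    return float(G.max())

def hadamard(n):
    """Orthonormal Sylvester-Hadamard basis; n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
    return H / math.sqrt(n)

# Small Dirac + Hadamard dictionary (stand-in for Dirac + Fourier)
n_small = 16
A = np.hstack([np.eye(n_small), hadamard(n_small)])
M = coherence(A)                    # 1/sqrt(16) for this pair of bases
l1_bound = (1 + 1 / M) / 2          # sparsity bound of Theorem 2

# Numbers used in the remark: n = 500, M = 1/sqrt(500), URP assumed
n = 500
M_500 = 1 / math.sqrt(n)
k_l1 = math.ceil((1 + 1 / M_500) / 2) - 1   # largest k with k < (1 + 1/M)/2
k_unique = n // 2                           # largest k with k < (n + 1)/2
print(M, l1_bound, k_l1, k_unique)
```

The gap between `k_l1` and `k_unique` is the range of sparsity levels for which uniqueness holds but Theorem 2 gives no guarantee.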

Remark 2. Note also that if the unique sparsest solution satisfies $\frac{1 + M^{-1}}{2} \le \|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$, the above theorem does not state that it 'cannot' be found by solving $P_1$; it simply does not 'guarantee' that $P_1$ can recover it. In fact, from the uniqueness Theorem 1 we know that if we find a solution $\mathbf{s}_0$ by using any method (e.g. $P_1$, or even simply by a magic guess), and it happens that $\|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$, we will know that we have found the unique sparsest solution.

In addition to the methods based on $\ell^1$ norm minimization, there are other ideas for finding the sparsest solution, for example Matching Pursuit (MP) [15] and Smoothed $\ell^0$ (SL0) [18]. The latter method (SL0), which has been designed in our group, tries to directly solve the $P_0$ problem by replacing the $\ell^0$ norm by a smooth approximation of it (hence the name 'smoothed' $\ell^0$). One of the motivations behind SL0 is the fact stated above: since the equivalence of $P_0$ and $P_1$ is guaranteed only where there exist very very sparse solutions, it would probably be better to try to solve $P_0$ directly. Another motivation is speed: it has been shown [18] that SL0 is much faster than solving $P_1$.

C. Stability against noise? 

Suppose that $\mathbf{x}_0$ is a linear combination of a few atoms of the dictionary, that is, $\mathbf{x}_0 = \mathbf{A}\mathbf{s}_0$, where $\mathbf{s}_0$ is sparse. Now consider a noisy measurement of $\mathbf{x}_0$, that is, $\mathbf{x} = \mathbf{x}_0 + \mathbf{n}$, where $\mathbf{n}$ denotes the noise, and $\|\mathbf{n}\|_2 \le \epsilon$. The question of 'stability' [1] is then: Even for a very small $\epsilon$, is it guaranteed that the sparse decomposition of $\mathbf{x}$ over the dictionary (problem $P_0$) is not too different from the sparse decomposition of $\mathbf{x}_0$? The answer is unfortunately no, that is, the problem $P_0$ can be too sensitive to noise [19].

To overcome this problem, it has been proposed in [1] that instead of solving $P_0$ or $P_1$ one considers solving their noise-aware variants:

$$(P_{0,\delta}): \quad \text{Minimize } \|\mathbf{s}\|_0 \ \text{ s.t. } \ \|\mathbf{x} - \mathbf{A}\mathbf{s}\|_2 \le \delta \tag{4}$$

$$(P_{1,\delta}): \quad \text{Minimize } \|\mathbf{s}\|_1 \ \text{ s.t. } \ \|\mathbf{x} - \mathbf{A}\mathbf{s}\|_2 \le \delta \tag{5}$$

In other words, it has been proposed to do an "approximate" decomposition, that is, a decomposition with $\|\mathbf{x} - \mathbf{A}\mathbf{s}\|_2 \le \delta$ instead of the exact decomposition $\mathbf{x} = \mathbf{A}\mathbf{s}$. These noise-aware variants have to be solved for a sufficiently large $\delta$, that is, for $\delta \ge \epsilon$, to guarantee that the true solution $\mathbf{s}_0$ satisfies the constraints of the above optimization problems. Then, in [1], the authors prove that both problems $P_{0,\delta}$ and $P_{1,\delta}$ are stable against noise, that is, the estimation error is at worst proportional to the noise level. More precisely, the stability of $P_{0,\delta}$ is given by the following theorem:

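Theorem 3 (Stability of $P_{0,\delta}$; Theorem 2.1 of [1]): Let $M$ denote the coherence of the dictionary $\mathbf{A}$. Suppose that for the sparse representation of the noiseless signal $\mathbf{x}_0 = \mathbf{A}\mathbf{s}_0$ we have:

$$k \triangleq \|\mathbf{s}_0\|_0 < \frac{1 + M^{-1}}{2}. \tag{6}$$

If $\hat{\mathbf{s}}_{0,\delta}$ denotes the result of applying $P_{0,\delta}$ on the noisy data $\mathbf{x}$ with $\delta \ge \epsilon$, then:

$$\|\hat{\mathbf{s}}_{0,\delta} - \mathbf{s}_0\|_2 \le \frac{\delta + \epsilon}{\sqrt{1 - M(2k - 1)}}. \tag{7}$$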
Note that (6) also implies that the term under the square root in (7) is positive.

The authors of [1] also prove the stability of $P_{1,\delta}$ for the case $\|\mathbf{s}_0\|_0 < (1 + M^{-1})/4$.

A noise-aware variant of SL0 (called Robust-SL0) has already been developed [20], which tries to solve $P_{0,\delta}$ directly, without passing through $P_{1,\delta}$.
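For very small dictionaries, $(P_{0,\delta})$ can be solved exactly by exhaustive search, which gives a reference solution against which the stability results can be checked. The following is a minimal sketch under that assumption; it is not Robust-SL0, and the name `solve_p0_delta` and the toy data are hypothetical.

```python
import itertools
import numpy as np

def solve_p0_delta(A, x, delta):
    """Sparsest s with ||x - A s||_2 <= delta, by exhaustive search over supports."""
    n, m = A.shape
    if np.linalg.norm(x) <= delta:
        return np.zeros(m)
    for k in range(1, m + 1):
        best, best_res = None, np.inf
        for cols in itertools.combinations(range(m), k):
            cols = list(cols)
            coef, *_ = np.linalg.lstsq(A[:, cols], x, rcond=None)
            res = np.linalg.norm(x - A[:, cols] @ coef)
            # among feasible supports of this size, keep the best fit
            if res <= delta and res < best_res:
                best, best_res = (cols, coef), res
        if best is not None:
            s = np.zeros(m)
            s[best[0]] = best[1]
            return s
    return None

# Hypothetical example: one active atom, small noise with ||n||_2 <= eps
A = np.array([[1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, -1.0]])
s0 = np.array([0.0, 0.0, 2.0, 0.0])
noise = np.array([0.05, 0.02])
eps = delta = 0.06
x = A @ s0 + noise
s_hat = solve_p0_delta(A, x, delta)
err = np.linalg.norm(s_hat - s0)
print(s_hat, err)
```

In this toy case the estimate keeps the correct support and its coefficient moves by an amount comparable to the noise level.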

D. Our Contribution 



As stated in Section II-C, the stability of the problem $P_{0,\delta}$ has only been shown for the case $\|\mathbf{s}_0\|_0 < (1 + M^{-1})/2$. This sparsity limit for stability is the same as the sparsity limit for the equivalence of $P_0$ and $P_1$ as stated in Theorem 2. However, as was stated in Remark 1 after Theorem 2, this sparsity limit is much more restrictive than the sparsity limit for the uniqueness of the sparse solution. In other words, current results state that although a sparse representation with $\frac{1 + M^{-1}}{2} \le \|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ is unique, it is not guaranteed that $P_{0,\delta}$ can stably recover this representation in the presence of noise.

The lack of this guarantee may not have been important before, because the classic idea for solving $P_0$ was solving $P_1$, and the sparsity limit for the equivalence of these two problems is the same as the sparsity limit for the stability of $P_{0,\delta}$. However, with new algorithms like SL0 or Robust-SL0, one can now try to solve $P_{0,\delta}$ directly, without relying on $P_{1,\delta}$. Hence it is now important to know whether or not sparse representations with $\frac{1 + M^{-1}}{2} \le \|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ are stable.

In the next section, we will show that $P_{0,\delta}$ is stable for the whole sparsity range that guarantees the uniqueness, that is, $P_{0,\delta}$ is stable whenever $\|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$. Moreover, we will show that for smaller $\|\mathbf{s}_0\|_0$ the problem is 'more stable', that is, the more sparsity, the more stability. Finally, we will show in Section IV that this stability holds not only for $P_{0,\delta}$, but also for any estimation $\hat{\mathbf{s}}_0$ such that $\|\hat{\mathbf{s}}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ and $\|\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}_0\|_2 \le \delta$.

III. The main theorem 

To state the main theorems, we first need to define two notations:

• Let $q = q(\mathbf{A}) \triangleq \mathrm{spark}(\mathbf{A}) - 1$. Then, by definition, every $q$ columns of $\mathbf{A}$ are linearly independent, and there is at least one set of $q + 1$ columns which are linearly dependent (in the literature, the quantity $q$ is usually called the 'Kruskal rank' or 'k-rank' of the matrix $\mathbf{A}$). It is also obvious that $q \le n$, where $q = n$ corresponds to the case in which $\mathbf{A}$ has the URP.

• Let $\sigma_{\min}^{(j)}$, $1 \le j \le q(\mathbf{A})$, denote the smallest singular value among all of the submatrices of $\mathbf{A}$ formed by taking $j$ columns of $\mathbf{A}$. Note that since every $q$ columns of $\mathbf{A}$ are linearly independent, we have $\sigma_{\min}^{(j)} > 0$, $\forall\, 1 \le j \le q(\mathbf{A})$.
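For a toy dictionary, both quantities just defined can be computed by brute force. The sketch below is only an illustration of the definitions (the enumeration is exponential in $m$); the function name `sigma_min_table`, the tolerance, and the 2 x 4 example are assumptions and not part of the paper.

```python
import itertools
import numpy as np

def sigma_min_table(A, tol=1e-10):
    """Return [sigma_min^(1), ..., sigma_min^(q)]; the list length is q(A)."""
    n, m = A.shape
    sigmas = []
    for j in range(1, min(n, m) + 1):
        # smallest singular value over all n x j column submatrices
        smallest = min(
            np.linalg.svd(A[:, list(cols)], compute_uv=False)[-1]
            for cols in itertools.combinations(range(m), j)
        )
        if smallest <= tol:
            break  # some j columns are dependent, so spark(A) = j
        sigmas.append(float(smallest))
    return sigmas

# Hypothetical dictionary: Dirac atoms plus two normalized diagonal atoms
A = np.array([[1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, -1.0]])
A = A / np.linalg.norm(A, axis=0)   # unit-norm atoms
sigmas = sigma_min_table(A)
q = len(sigmas)                     # Kruskal rank, so spark(A) = q + 1
print(q, sigmas)
```

With unit-norm atoms $\sigma_{\min}^{(1)} = 1$, and for this example $\sigma_{\min}^{(2)} = \sqrt{1 - 1/\sqrt{2}}$, attained by a Dirac atom paired with a diagonal atom.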

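Moreover, it is known [21, p. 419], [22, Lemma 3] that if we add a new column to a full-rank tall matrix, its smallest singular value decreases or remains the same (refer to [22] for a simple direct proof). Therefore, $\sigma_{\min}^{(j)}$ is a non-increasing sequence in $j$, that is:

$$\sigma_{\min}^{(j)} \ge \sigma_{\min}^{(j+1)} > 0, \quad \forall\, 1 \le j \le q - 1. \tag{8}$$

We are now ready to state the following theorem.

Theorem 4 (Stability of $P_{0,\delta}$): Suppose that the noiseless signal $\mathbf{x}_0$ has a sparse representation $\mathbf{x}_0 = \mathbf{A}\mathbf{s}_0$ satisfying $\|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$. Let also $\mathbf{x} = \mathbf{x}_0 + \mathbf{n}$ be a noisy measurement of $\mathbf{x}_0$ with $\|\mathbf{n}\|_2 \le \epsilon$. If $\hat{\mathbf{s}}_{0,\delta}$ denotes the result of applying $P_{0,\delta}$ on the noisy signal $\mathbf{x}$ with $\delta \ge \epsilon$, then:

$$\|\hat{\mathbf{s}}_{0,\delta} - \mathbf{s}_0\|_2 \le \frac{\delta + \epsilon}{\sigma_{\min}^{(\ell)}}, \tag{9}$$

where $\ell \triangleq 2\|\mathbf{s}_0\|_0$.

Remark 1. Theorem 4 shows that $P_{0,\delta}$ is stable not only for $\|\mathbf{s}_0\|_0 < \frac{1 + M^{-1}}{2}$, but also for the whole uniqueness range $\|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$. The stability is in the sense that the estimation error increases at worst proportionally to the noise level. Moreover, from (8), the upper bound on the estimation error decreases or remains the same as the sparsity increases (this is because a sparser $\mathbf{s}_0$ means a smaller $\|\mathbf{s}_0\|_0$, which implies a smaller $\ell$ and hence a larger or equal $\sigma_{\min}^{(\ell)}$). In other words, sparser solutions are 'more stable'.

Remark 2. The main reason for stating Theorem 4 is to provide a stability result for the case $1 + M^{-1} \le \ell = 2\|\mathbf{s}_0\|_0 < \mathrm{spark}(\mathbf{A})$, because in this case Theorem 3 provides no stability result. Moreover, note that for the case $\ell < 1 + M^{-1}$, in which both bounds (7) and (9) are applicable, (9) also provides a tighter bound than (7). This follows from Lemma 2.2 of [1], which states that in this case $\sigma_{\min}^{(\ell)} \ge \sqrt{1 - M(\ell - 1)}$.

Proof of Theorem 4: Let us define $\hat{\mathbf{x}}_{0,\delta} \triangleq \mathbf{A}\hat{\mathbf{s}}_{0,\delta}$. We write:

$$\|\mathbf{x}_0 - \hat{\mathbf{x}}_{0,\delta}\|_2 = \|\mathbf{x} - \mathbf{n} - \hat{\mathbf{x}}_{0,\delta}\|_2 = \|(\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}_{0,\delta}) - \mathbf{n}\|_2 \le \underbrace{\|\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}_{0,\delta}\|_2}_{\le \delta} + \underbrace{\|\mathbf{n}\|_2}_{\le \epsilon} \le \delta + \epsilon. \tag{10}$$

On the other hand:

$$\mathbf{x}_0 - \hat{\mathbf{x}}_{0,\delta} = \mathbf{A}(\mathbf{s}_0 - \hat{\mathbf{s}}_{0,\delta}) = \mathbf{B}\mathbf{v}, \tag{11}$$

where $\mathbf{v}$ is a vector composed of the non-zero entries of $\mathbf{s}_0 - \hat{\mathbf{s}}_{0,\delta}$, and $\mathbf{B}$ is a submatrix of $\mathbf{A}$ composed of the columns of $\mathbf{A}$ corresponding to the non-zero entries of $\mathbf{s}_0 - \hat{\mathbf{s}}_{0,\delta}$. Since $\delta \ge \epsilon$, $\mathbf{s}_0$ satisfies the constraint of the optimization problem $P_{0,\delta}$, and hence $\|\hat{\mathbf{s}}_{0,\delta}\|_0 \le \|\mathbf{s}_0\|_0$. Therefore $\mathbf{s}_0 - \hat{\mathbf{s}}_{0,\delta}$ has at most $\ell = 2\|\mathbf{s}_0\|_0 < \mathrm{spark}(\mathbf{A})$ non-zero entries (note that $\ell < \mathrm{spark}(\mathbf{A})$ means $\ell \le q(\mathbf{A})$). In other words, $\mathbf{B}$ has at most $\ell \le q$ columns, so its smallest singular value is at least $\sigma_{\min}^{(j)}$ for some $j \le \ell$, and hence (by having also in mind (8)):

$$\|\mathbf{B}\mathbf{v}\|_2 \ge \sigma_{\min}^{(\ell)} \|\mathbf{v}\|_2. \tag{12}$$

Noting that $\|\mathbf{v}\|_2 = \|\mathbf{s}_0 - \hat{\mathbf{s}}_{0,\delta}\|_2$, and combining the above inequality with (11), we obtain

$$\|\mathbf{x}_0 - \hat{\mathbf{x}}_{0,\delta}\|_2 \ge \sigma_{\min}^{(\ell)} \|\mathbf{s}_0 - \hat{\mathbf{s}}_{0,\delta}\|_2. \tag{13}$$

Combining (10) and (13) gives:

$$\sigma_{\min}^{(\ell)} \|\mathbf{s}_0 - \hat{\mathbf{s}}_{0,\delta}\|_2 \le \delta + \epsilon, \tag{14}$$

which completes the proof.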
Remark 3. From (8) and $\ell = 2\|\mathbf{s}_0\|_0 \le q(\mathbf{A})$, we may replace $\sigma_{\min}^{(\ell)}$ by its worst case $\sigma_{\min}^{(q)}$ to obtain the following looser bound, which does not require knowing the value of $\|\mathbf{s}_0\|_0$:

$$\|\hat{\mathbf{s}}_{0,\delta} - \mathbf{s}_0\|_2 \le \frac{\delta + \epsilon}{\sigma_{\min}^{(q)}}. \tag{15}$$
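The bounds (9) and (15) can be checked numerically on a small random dictionary, where both $P_{0,\delta}$ and $\sigma_{\min}^{(j)}$ are still computable by exhaustive search. The sketch below is only a sanity check under stated assumptions: the dictionary is Gaussian with unit-norm atoms (so that $\mathrm{spark}(\mathbf{A}) = n + 1$ is assumed to hold), and the helper names, sizes, seed, and noise level are all hypothetical choices.

```python
import itertools
import numpy as np

def sigma_min(A, j):
    """Smallest singular value over all submatrices made of j columns of A."""
    return min(np.linalg.svd(A[:, list(c)], compute_uv=False)[-1]
               for c in itertools.combinations(range(A.shape[1]), j))

def solve_p0_delta(A, x, delta):
    """Brute-force solution of P_{0,delta} (toy sizes only)."""
    m = A.shape[1]
    if np.linalg.norm(x) <= delta:
        return np.zeros(m)
    for k in range(1, m + 1):
        for c in itertools.combinations(range(m), k):
            c = list(c)
            coef, *_ = np.linalg.lstsq(A[:, c], x, rcond=None)
            if np.linalg.norm(x - A[:, c] @ coef) <= delta:
                s = np.zeros(m)
                s[c] = coef
                return s
    return None

rng = np.random.default_rng(0)
n, m, k = 5, 8, 2                    # k < spark(A)/2 = 3 when spark(A) = n + 1
A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)       # unit-norm atoms
eps = delta = 0.05
ell = 2 * k
bound_9 = (delta + eps) / sigma_min(A, ell)   # bound (9)
bound_15 = (delta + eps) / sigma_min(A, n)    # bound (15), with q = n assumed

errors = []
for _ in range(20):
    s0 = np.zeros(m)
    s0[rng.choice(m, size=k, replace=False)] = rng.uniform(1.0, 2.0, size=k)
    noise = rng.standard_normal(n)
    noise *= eps * rng.uniform() / np.linalg.norm(noise)   # ||noise||_2 <= eps
    s_hat = solve_p0_delta(A, A @ s0 + noise, delta)
    errors.append(np.linalg.norm(s_hat - s0))
print(max(errors), bound_9, bound_15)
```

Every observed error should stay below `bound_9`, which in turn is no larger than `bound_15`; how far below depends on the particular dictionary and noise draw.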



IV. A generalized stability theorem

If we carefully re-examine the proof of Theorem 4, we notice that the fact that $\|\hat{\mathbf{s}}_{0,\delta}\|_0 \le \|\mathbf{s}_0\|_0$ is not essential for obtaining the looser bound (15). Hence, the bound (15) holds not only for the sparse recovery methods based on solving $P_{0,\delta}$, but also for any other estimation $\hat{\mathbf{s}}_{0,\delta}$ (obtained from any sparse recovery algorithm or even simply from a magic guess), provided that it satisfies $\|\hat{\mathbf{s}}_{0,\delta}\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ and $\|\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}_{0,\delta}\|_2 \le \delta$. In other words, not only is $P_{0,\delta}$ stable, but any other method for 'approximate' sparse representation is also stable, provided that it yields a sparse enough estimation. More precisely:

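Theorem 5 (Stability of approximate sparse representation): Suppose that the noiseless signal $\mathbf{x}_0$ has a sparse representation $\mathbf{x}_0 = \mathbf{A}\mathbf{s}_0$ satisfying $\|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$. Let also $\mathbf{x} = \mathbf{x}_0 + \mathbf{n}$ be a noisy measurement of $\mathbf{x}_0$ with $\|\mathbf{n}\|_2 \le \epsilon$. If we have at hand an estimation $\hat{\mathbf{s}}_{0,\delta}$ of the sparse representation coefficients which satisfies $\|\hat{\mathbf{s}}_{0,\delta}\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ and $\|\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}_{0,\delta}\|_2 \le \delta$, then:

$$\|\hat{\mathbf{s}}_{0,\delta} - \mathbf{s}_0\|_2 \le \frac{\delta + \epsilon}{\sigma_{\min}^{(q)}}. \tag{16}$$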

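Proof: The result is obtained by following the same steps as in the proof of Theorem 4: equations (10) and (11) still hold. We then note that:

$$\|\hat{\mathbf{s}}_{0,\delta} - \mathbf{s}_0\|_0 \le \|\hat{\mathbf{s}}_{0,\delta}\|_0 + \|\mathbf{s}_0\|_0 < \mathrm{spark}(\mathbf{A}), \tag{17}$$

and hence $\|\hat{\mathbf{s}}_{0,\delta} - \mathbf{s}_0\|_0 \le q(\mathbf{A})$. Consequently, instead of (12) we write:

$$\|\mathbf{B}\mathbf{v}\|_2 \ge \sigma_{\min}^{(q)} \|\mathbf{v}\|_2, \tag{18}$$

which in combination with (10) and (11) proves (16).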



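Remark. Note that the condition $\delta \ge \epsilon$ does not explicitly appear in Theorem 5 and is no longer essential (while it was essential in Theorem 4, because it was necessary to ensure that $P_{0,\delta}$ gives an estimation satisfying $\|\hat{\mathbf{s}}_{0,\delta}\|_0 \le \|\mathbf{s}_0\|_0$, which was essential in the proof). However, implicitly, the $\delta$ in Theorem 5 cannot be too small, because for a very small $\delta$ it is possible that there exists no $\hat{\mathbf{s}}_{0,\delta}$ satisfying both $\|\hat{\mathbf{s}}_{0,\delta}\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$ and $\|\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}_{0,\delta}\|_2 \le \delta$.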
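Theorem 5 can be illustrated by testing estimates that do not come from $P_{0,\delta}$ at all. In the sketch below, every support of size $k$ is treated as a 'magic guess': the estimate is the least-squares fit on that support, and $\delta$ is simply set to the residual it achieves, so that the hypotheses of the theorem hold by construction. The Gaussian dictionary (for which $\mathrm{spark}(\mathbf{A}) = n + 1$ is assumed), the sizes, the seed, and the variable names are hypothetical.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 5, 8, 2                    # k < spark(A)/2 = 3 when spark(A) = n + 1
A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)       # unit-norm atoms

# sigma_min^(q) with q = n (assumes spark(A) = n + 1)
sigma_q = min(np.linalg.svd(A[:, list(c)], compute_uv=False)[-1]
              for c in itertools.combinations(range(m), n))

s0 = np.zeros(m)
s0[[1, 4]] = [1.5, -1.0]             # true 2-sparse representation
eps = 0.05
noise = rng.standard_normal(n)
noise *= eps / np.linalg.norm(noise) # ||noise||_2 = eps
x = A @ s0 + noise

worst_ratio = 0.0
for c in itertools.combinations(range(m), k):
    c = list(c)
    s_hat = np.zeros(m)
    s_hat[c] = np.linalg.lstsq(A[:, c], x, rcond=None)[0]   # guess on support c
    delta = np.linalg.norm(x - A @ s_hat)                   # residual it achieves
    bound_16 = (delta + eps) / sigma_q                      # right side of (16)
    err = np.linalg.norm(s_hat - s0)
    worst_ratio = max(worst_ratio, err / bound_16)
print(worst_ratio)
```

A wrong support gives a large residual and therefore a large $\delta$, so the bound remains valid but becomes uninformative; the ratio of error to bound should never exceed one.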

V. Conclusion 

Since minimizing the $\ell^1$ norm has been one of the first and most successful ideas for finding the sparsest solution of a USLE, some theoretical aspects of the sparsest solution are currently too much influenced by the $\ell^1$ minimization idea. Now that algorithms are available which try to find the sparse solution by other approaches, e.g. SL0 and Robust-SL0, some of the properties of the sparsest solution need to be revisited. In this paper, we studied the stability of the sparsest solution, and we showed that it is stable not only where $\|\mathbf{s}_0\|_0 < (1 + M^{-1})/2$, but also for the whole uniqueness range $\|\mathbf{s}_0\|_0 < \frac{1}{2}\mathrm{spark}(\mathbf{A})$. These results show the practical interest of designing $\ell^0$-norm minimization algorithms, since they can provide a good estimation from noisy data under the weakest sparsity condition.